Many a statistician or serious student of statistics will recall
his or her introduction to the great George Box quote: “All models
are wrong; some are useful.” Some of us first saw it in the response
surfaces text written by Box and Draper (1987), but the quote is
actually attributed to Box in one of his much earlier texts (Box
1979; Box and Draper 1987).

We know from Dr. Box’s good teachings that, to improve the
credibility of a regression model, the modeler should check a number
of assumptions (e.g., that observations are independent and that
residuals are normally distributed around zero) and should test that
the model does not significantly overfit or underfit the data (e.g.,
with lack-of-fit tests). While the exact tests used for these checks
do not port directly to the validation of modeling and simulation
(M&S), the point of recalling them is a healthy one: treat the
validation of Department of Defense (DoD) M&S more as a science and
less as an art.
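As an illustration of the kind of regression checks mentioned above, the following is a minimal sketch in Python using only the standard library. It computes the Durbin-Watson statistic (a common check for serial correlation in residuals) and a lack-of-fit F statistic for a straight-line fit. The function names and the sample data are assumptions made for this example, not part of any DoD tool or of Box's texts. The sketch stops at the test statistics and does not compute p-values or test normality.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    return y_bar - b * x_bar, b


def residuals(x, y):
    """Observed minus fitted values for the straight-line fit."""
    a, b = fit_line(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


def durbin_watson(res):
    """Statistic in [0, 4]; values near 2 suggest no first-order serial correlation."""
    num = sum((res[i] - res[i - 1]) ** 2 for i in range(1, len(res)))
    den = sum(e * e for e in res)
    return num / den


def lack_of_fit_f(x, y):
    """F statistic comparing lack-of-fit to pure error.

    Requires replicated observations at some x values and more than
    two distinct x values; otherwise the degrees of freedom are zero.
    """
    a, b = fit_line(x, y)
    groups = {}
    for xi, yi in zip(x, y):
        groups.setdefault(xi, []).append(yi)
    # Pure error: scatter of replicates around their own group mean.
    ss_pe = sum((yi - mean(ys)) ** 2 for ys in groups.values() for yi in ys)
    # Lack of fit: distance from each group mean to the fitted line.
    ss_lof = sum(len(ys) * (mean(ys) - (a + b * xi)) ** 2
                 for xi, ys in groups.items())
    n, m = len(x), len(groups)
    return (ss_lof / (m - 2)) / (ss_pe / (n - m))


if __name__ == "__main__":
    # Hypothetical data: two replicates at each of four settings.
    x = [1, 1, 2, 2, 3, 3, 4, 4]
    y_curved = [1.1, 0.9, 4.1, 3.9, 9.1, 8.9, 16.1, 15.9]
    print("Durbin-Watson:", durbin_watson(residuals(x, y_curved)))
    print("Lack-of-fit F:", lack_of_fit_f(x, y_curved))
```

A large lack-of-fit F relative to the appropriate F distribution would suggest that the straight line underfits the data. An analogous question for M&S validation is whether the simulation's disagreement with test data exceeds what test-to-test variability alone would explain.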

This article reports on a number of efforts intended to provide the
policy infrastructure and some initial tools to help move the DoD in
that direction. Specifically, we report on recent changes to
verification, validation, and accreditation (VV&A) policy (DoDI
5000.61); recent and ongoing activities to advance VV&A (e.g.,
reporting standards, methodologies in light of risk, VV&A planning
and reporting tools); and a series of practical application efforts
being conducted with the test and evaluation (T&E) community to
fully assess the viability of these evolutions in policy,
methodology, standards, and tools. This “VV&A Campaign Plan” is
being managed by the Office of the Secretary of Defense’s Modeling &
Simulation Coordination Office (M&S CO) and its technical lead, the
Applied Physics Laboratory at the Johns Hopkins University; by its
conclusion, however, it will have involved managers and
practitioners from across the acquisition and T&E communities.


Department  of  Defense  Instruction 
5000.61  (DoDI  5000.61) 

In 2007, the DoD M&S Steering Committee embarked on a rewrite of the
policy guiding VV&A in the Department, DoDI 5000.61, entitled “DoD
Modeling and Simulation Verification, Validation, and Accreditation
(VV&A)” (DoD 2009). That revision was completed and vetted in late
2009. The policy itself has been in existence for over 13 years. The
document defines policy, assigns responsibilities, prescribes
procedures, and establishes common terminology relative to VV&A
across the Department. Initially issued in April 1996, the
instruction was reissued in May 2003 and, most recently, in December
2009. The most significant changes in the current version of the
instruction were:

•  modifying the document format to align with updated format
requirements for DoD issuances;

•  streamlining the document by:

   o  pulling the responsibilities section to an enclosure,

   o  focusing the procedure section on documentation requirements
      only,

   o  synthesizing documentation procedures by eliminating
      duplication of information requirements,

   o  retaining only essential definitions;

•  modifying the document to reference MIL-STD-3022 (DoD 2008), the
DoD Standard Practice for the Documentation of VV&A for Models and
Simulations.

DoDI 5000.61 provides high-level or “umbrella” policy to which
component-based (e.g., the Services; the Director, Operational Test
and Evaluation) and organizational (e.g., the Navy’s Commander,
Operational Test and Evaluation Force) policies align. As defined in
DoDI 5000.61, it is DoD policy that:

•  M&S  used  to  support  DoD  processes,  products, 
and  decisions  shall  undergo  V&V  throughout 
their  lifecycles  and  shall  be  accredited  for  an 
intended  use; 
•  VV&A results shall be documented and made
accessible;

•  each  DoD  component  will  be  the  final  authority 
for  validating  representations  of  its  own  forces 
and  capabilities; 

•  each DoD component is authorized to provide VV&A
procedures and guidance based on the intended
use and risk of use of the M&S.

In addition to implementing the policy statements listed above, DoD
components are responsible for ensuring that V&V resources are
provided throughout M&S development, modification, or use.

Activities  to  advance  VV&A  methodology

The DoD M&S Steering Committee has sponsored several VV&A-focused
tasks with the objective of advancing the state of the practice.
These tasks include the development of a risk-driven VV&A
methodology as well as automated, consistent documentation formats.
The tasks are described below.

Risk-based  VV&A  (RBA)

The objective of the risk-based VV&A (RBA) methodology task is to
optimize VV&A resource use while minimizing the risks of using a
model or simulation. Use risk arises from errors and uncertainties
in the representations of the model or simulation as well as from
the consequences of a decision predicated on M&S results. While VV&A
seeks to provide evidence relative to M&S error and uncertainty,
most efforts operate in a resource-constrained environment. As a
consequence, VV&A processes must be tailored to optimize resource
use within those constraints, to accommodate the alignment of VV&A
with the simulation life cycle, and to address the specific
priorities of an intended use.

During the initial (planning) phase of the process, the RBA
methodology provides a way to tailor the VV&A effort by focusing on
those aspects of the M&S that carry the greatest risk relative to
the intended use. At the back end of the process, the RBA
methodology provides a framework in which to articulate risk in
terms of the uncertainties and limitations associated with the M&S
capabilities.
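To make the idea of risk-driven tailoring concrete, the sketch below ranks aspects of a model by a simple use-risk score and funds V&V for the highest-risk aspects until a fixed budget is exhausted. It is a hypothetical illustration only: the 1-to-5 scoring scales, the product formula for use risk, the greedy allocation, and all names and numbers are assumptions made for this example and are not taken from the RBA report.

```python
from dataclasses import dataclass


@dataclass
class ModelAspect:
    """One aspect of an M&S representation under consideration for V&V."""
    name: str
    error_likelihood: int  # hypothetical scale: 1 (low) to 5 (high)
    consequence: int       # hypothetical scale: 1 (minor) to 5 (severe)
    vv_cost: float         # hypothetical effort units

    @property
    def use_risk(self) -> int:
        # Assumed scoring rule: likelihood of error times consequence.
        return self.error_likelihood * self.consequence


def plan_vv(aspects, budget):
    """Fund V&V for the highest-risk aspects first.

    Returns (funded, residual). The residual list holds the aspects
    whose risk remains unassessed and should be reported as such.
    """
    funded, residual = [], []
    remaining = budget
    for aspect in sorted(aspects, key=lambda a: a.use_risk, reverse=True):
        if aspect.vv_cost <= remaining:
            funded.append(aspect)
            remaining -= aspect.vv_cost
        else:
            residual.append(aspect)
    return funded, residual


if __name__ == "__main__":
    aspects = [
        ModelAspect("sensor model", 4, 5, 30.0),
        ModelAspect("terrain database", 2, 2, 10.0),
        ModelAspect("communications model", 3, 4, 25.0),
    ]
    funded, residual = plan_vv(aspects, budget=50.0)
    print("V&V planned for:", [a.name for a in funded])
    print("Residual risk to report:", [a.name for a in residual])
```

The two returned lists mirror the two ends of the process described above: the funded list corresponds to planning-phase tailoring, and the residual list corresponds to the limitations that would be articulated to the accreditation authority at the back end.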

The RBA task is in the second year of a two-year effort. A report
titled “An Approach for Realizing a Risk-Based VV&A (RBA)
Methodology” was completed in March 2010.

MIL-STD-3022  and  the  VV&A
Documentation  Tool

MIL-STD-3022, “Department of Defense Standard Practice:
Documentation of Verification, Validation, and Accreditation (VV&A)
for Models and Simulations,” was developed by the Modeling and
Simulation Coordination Office in coordination with the military
departments. It establishes templates for the four core products
(Accreditation Plan, V&V Plan, V&V Report, and Accreditation Report)
of the M&S verification, validation, and accreditation processes.
The intent of this standard is to provide consistent documentation
that minimizes redundancy and maximizes reuse of information. This
promotes a common framework and interfacing capability that can be
shared across all M&S programs within the Department of Defense,
other government agencies, and allied nations. MIL-STD-3022 was
approved as a Military Standard on June 28, 2008.

In addition to the template standard, the U.S. Navy has led the
production of a tool (the DoD VV&A Documentation Tool) that
automates the MIL-STD-3022 templates. At this time, access to the
tool requires a Common Access Card or External Certification
Authority.

While these tasks seek to improve the efficiency and effectiveness
of VV&A implementation, technical gaps remain that limit both. In an
attempt to identify and address these gaps, the M&S Steering
Committee has implemented a multiprong approach. The first prong
addresses known VV&A gaps that impact successful implementation of
the RBA methodology, such as the development of a V&V techniques
catalog and guidance on the development of acceptability criteria.
The second prong focuses on the identification of additional gap
areas through a Systems Engineering Research Center-led study
focused on VV&A practitioners in the field. The third prong focuses
on the establishment of an expert subcommittee that can monitor,
review, and mature proposed technical advancements.

Gauging  the  way  forward  in  support  of 
the  test  and  evaluation  community 

Fiscal year 2010 and 2011 VV&A activities within M&S CO were
originally planned to focus on the evolution of a tiered validation
methodology, not dissimilar to the establishment of Technology
Readiness Levels for M&S and distributed M&S capabilities. During
initial program reviews for this effort and its linkages to the
broader set of DoD VV&A activities, it was determined that the
current VV&A investments should first be assessed against the needs
of the community, before any additional policies, standards, or
methodologies were crafted. In particular, DoD leadership wanted to
assess the viability of VV&A efforts from the perspective of VV&A
practitioners: the organizations and individuals who “do” VV&A as
opposed to just managing its policy.
The M&S CO and the Johns Hopkins University team guiding these VV&A
efforts fully understand that even the best policy, tools,
methodologies, templates, and standards do not provide a panacea or
“magic fix.” VV&A activities in support of the T&E community will
always, by definition, be challenging because the T&E team is faced
with several inherent, and frequently conflicting, facts:

•  Test  activities  involve  new  or  enhanced  systems 
and  systems  of  systems,  which  typically  do  not 
have  a  rich  body  of  information  or  data  about 
their  performance,  design,  or  employment. 

•  T&E in the era of “build and field it faster” will
put even greater pressure on the team of
evaluators, testers, and technologists performing
the VV&A activities. The increasing complexity
of systems being fielded, and the need to use
distributed M&S capabilities fully integrated
with the systems/systems of systems under test
to realistically replicate operational environments
or to make up for shortfalls in available personnel
and equipment, make failures during VV&A
even more visible and expensive.

•  The VV&A team for a given T&E activity is
typically drawn from several organizations, may
be required to use a broad variety of M&S
tools, familiar and unfamiliar, to conduct the
test events, and is frequently scattered across the
country, converging on the test site only during
pretest integration and the test event.

Over the next 18 months, a series of practical-application
proof-of-principle activities will be conducted with acquisition and
testing organizations within all four military departments. The
first proof-of-principle event will focus on the application of VV&A
early in the system life cycle, as a key part of the systems
engineering process, and will examine verification methodologies and
impacts primarily on early developmental and technical testing. The
second proof of principle will focus on the application of the VV&A
policies, standards, templates, etc. described above to a series of
live-virtual-constructive experimentation events led by the Air
Force, with the goal of examining the application of VV&A to
technically and geographically distributed test environments. The
third proof-of-principle activity being sponsored as part of the
VV&A Campaign Plan will be the application of the evolved VV&A
capabilities to an ongoing series of vehicle survivability tests
being conducted primarily by U.S. land forces (Army and Marines).

Use  case  1 — Systems  engineering  overview 

The first VV&A proof-of-principle activity will involve the Systems
Engineering Research Center, a consortium of 18 universities led by
the Stevens Institute of Technology. Systems Engineering Research
Center researchers have been tasked to look at tools and processes
that will facilitate the verification of M&S capabilities
(stand-alone and federated).

Use  case  2 — AGILE  Fires  overview 

The second proof-of-principle activity will be led by the Air
Force’s Simulation and Analysis Facility and will focus on the
development of a use case based on the AGILE Fires project. This use
case will specifically address the unique challenges encountered in
performing VV&A of a live, virtual, and constructive distributed
simulation environment. The use case will emphasize the development
of test scenarios and will also examine the integration of a new
component into an established distributed environment that has
already undergone VV&A. The objective will be to use existing
methodologies and strict configuration control to conduct a
reverification, revalidation, and reaccreditation of the scenario by
assessing only what changed.
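One way to picture “assessing only what changed” is as an impact analysis over the dependencies among components of the distributed environment. The sketch below is a hypothetical illustration, not the AGILE Fires method: it assumes that configuration control records which components depend on which, and it returns the changed components together with everything that transitively depends on them. All component names are invented for the example.

```python
from collections import deque


def components_to_reassess(depends_on, changed):
    """Return the changed components plus all of their transitive dependents.

    depends_on maps each component name to the components it relies on.
    Components outside the returned set keep their existing V&V evidence.
    """
    # Invert the map so each component points to the ones that use it.
    dependents = {}
    for component, requirements in depends_on.items():
        for required in requirements:
            dependents.setdefault(required, set()).add(component)

    affected = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for user in dependents.get(current, ()):
            if user not in affected:
                affected.add(user)
                queue.append(user)
    return affected


if __name__ == "__main__":
    # Hypothetical federation held under configuration control.
    federation = {
        "terrain_db": [],
        "flight_sim": ["terrain_db"],
        "radar_model": ["terrain_db"],
        "weapon_model": [],
        "scenario": ["flight_sim", "weapon_model"],
    }
    print(sorted(components_to_reassess(federation, {"weapon_model"})))
```

In this sketch a change to a leaf component touches only the scenario that uses it, whereas a change to a shared resource such as the terrain database would ripple through most of the federation. That distinction is the practical argument for strict configuration control.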

The use case would ultimately define a consistent method for
performing and documenting the VV&A effort of a system of systems
represented in a distributed simulation environment in order to
establish confidence in using live, virtual, and constructive
representations.

Use  case  3 — Vehicle  survivability 
testing  overview 

This effort is very much in its infancy as this article is being
written. It is also, by far, the most challenging of the three
proofs of principle in terms of classification, timeliness, and
impacts. Members of the M&S CO VV&A team will gain insights into its
evolved VV&A capabilities while also providing direct support to the
teams doing survivability analysis.

Closing  remarks 

We have addressed a number of initiatives intended to assist VV&A
practitioners department-wide, with a special focus on the T&E
community. Clearly, we are on the verge of a paradigm shift as the
community moves toward more complex, next-generation testing
procedures. And, while the initiatives in this article are designed
to help, we understand that they are but a small step forward in a
journey of continuous improvement. Future improvements must focus on
the use of advanced statistics (experimental design principles) to
make the most of every test, and on machine learning methods to work
toward automating some of the VV&A process. Only by moving VV&A from
art to science will we be better positioned to serve our
warfighters. The T&E community has long been on the leading edge of
tackling this challenge, and we look forward to your continued
innovation and commitment in meeting that goal.